Arrays in Compiler

  • How does the compiler work with arrays

How does the compiler work with arrays


This applies to one-dimensional arrays.

Simple variable

int x = 10;

We use names to refer to variables, but the compiler uses addresses. It does not identify a variable by its name; it uses the address of the variable's location in memory.
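A minimal sketch of the idea that a variable is really a location. The function names `store_at` and `write_through_address` are made up for this illustration; the point is only that the value of x can be changed through its address without using its name.

```cpp
#include <cassert>

// Writes new_value into whatever int lives at the given address.
// No variable name is involved here, only the location.
void store_at(int* address, int new_value) {
    *address = new_value;
}

// Hypothetical demo: returns the value of x after a write through its address.
int write_through_address() {
    int x = 10;           // the name x is for the programmer
    int* where = &x;      // &x is the address of the location behind the name
    store_at(where, 25);  // changes x without mentioning its name
    assert(*where == x);  // the name and the address refer to the same location
    return x;
}
```

Calling `write_through_address()` returns 25, because the write through the address and the name x refer to the same memory.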

The compiler needs to convert the variable name to an address. But when is memory allocated? At execution time, NOT at compile time.

Only then can the address be known.

So how can the compiler know the address at compile time?

Let's apply this to an array:

int A[5] = {3,4,5,6,7};

[Figure: compiler.png]

So if I have code:

A[3] = 10;

How does the compiler get this address?

This code needs to be converted to machine code.

So the compiler needs:

  1. The address of the location

At compile time, this is not known. So the compiler has a formula for obtaining the address, and the calculation is done at EXECUTION time.
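A small sketch of why the calculation has to wait until execution time. The names `set_element` and `same_code_different_bases` are assumptions for this example. The function is compiled once, but the base address and the index only arrive when it runs, so the same code works for arrays stored at different locations. (An optimizing compiler may still precompute some of this when it can see constant values.)

```cpp
#include <cassert>
#include <cstddef>

// The base address and the index are ordinary run-time values here.
// The element's address is worked out from them: base + index * sizeof(int).
void set_element(int* base, std::size_t index, int value) {
    base[index] = value;
}

// Hypothetical demo: one function, two arrays at two different base addresses.
bool same_code_different_bases() {
    int A[5] = {3, 4, 5, 6, 7};
    int B[5] = {0, 0, 0, 0, 0};
    assert(&A[0] != &B[0]);   // the two arrays start at different addresses
    set_element(A, 3, 10);
    set_element(B, 3, 10);
    return A[3] == 10 && B[3] == 10;
}
```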

Formula for calculation of array address by compiler


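So, assuming the array starts at address 200 and each int occupies 2 bytes, we have:

A[0] at addresses 200 to 201

A[1] at addresses 202 to 203

A[2] at addresses 204 to 205

A[3] at addresses 206 to 207

A[4] at addresses 208 to 209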

We want to set:

$$A[3] = 10$$

Let's call the first address:

$$L_0 = 200$$

This is the starting point (base) for calculating the address of any element in the array.

$$\text{Address of A[3]} = 200 + 3\times 2$$

$$\text{Address of A[3]} = 206$$

Address of any location - FORMULA USED BY COMPILER:

$$\text{Address of A[i]} = L_0 + i \times w$$

$$\text{where } w = \text{size of data type}$$

$$\text{where } i = \text{index of array}$$
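The formula can be written as a tiny function and checked against the example numbers. This is only a sketch: the name `address_zero_based` is made up here, and the base address 200 and the 2-byte element size are the assumed values from the example, not what a real machine would report.

```cpp
// Logical address of A[i] for a zero-based array: L0 + i * w.
constexpr unsigned address_zero_based(unsigned base, unsigned index, unsigned width) {
    return base + index * width;
}

// Values from the example: base address 200, 2-byte elements (assumed).
static_assert(address_zero_based(200, 0, 2) == 200, "A[0] starts at 200");
static_assert(address_zero_based(200, 3, 2) == 206, "A[3] starts at 206");
static_assert(address_zero_based(200, 4, 2) == 208, "A[4] starts at 208");
```

The checks are done with `static_assert` only because the example values are constants; with a real array the base address is not known until the program runs.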

This is not the actual address but a logical address, which is relative to the base address.

This formula is for languages like C++, where the index starts at 0.
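To see the zero-based formula on a real C++ array, the sketch below computes the address by hand in bytes and compares it with what indexing gives. The helper names are assumptions for this example, and w is taken as `sizeof(int)` on whatever machine runs the code (commonly 4 bytes today, not the 2 bytes used in the example numbers).

```cpp
#include <cassert>
#include <cstddef>

// Computes the address of element i by hand: base + i * w, counted in bytes.
// `base` points at A[0]; w is sizeof(int) on the machine running the code.
int* element_address(int* base, std::size_t i) {
    unsigned char* first_byte = reinterpret_cast<unsigned char*>(base);
    return reinterpret_cast<int*>(first_byte + i * sizeof(int));
}

// Hypothetical check: the hand-computed address equals &A[i] for every index.
bool formula_matches_indexing() {
    int A[5] = {3, 4, 5, 6, 7};
    for (std::size_t i = 0; i < 5; ++i) {
        if (element_address(A, i) != &A[i]) return false;
    }
    return true;
}

// Performs A[3] = 10 using the hand-computed address and returns A[3].
int store_via_formula() {
    int A[5] = {3, 4, 5, 6, 7};
    int* target = element_address(A, 3);
    assert(target == &A[3]);
    *target = 10;
    return A[3];
}
```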

What is the formula when the index starts at 1?

Formula for calculation of array address by compiler

When the index starts at 1, we have the following formula:


$$\text{Address of A[3]} = 200 + (3-1)\times 2$$

$$\text{Address of A[3]} = 204$$

Address of any location - FORMULA USED BY COMPILER:

$$\text{Address of A[i]} = L_0 + (i-1) \times w$$

$$\text{where } w = \text{size of data type}$$

$$\text{where } i = \text{index of array}$$
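The one-based formula as a small function, again only as a sketch. The name `address_one_based` is made up for this example, and the base address 200 and the 2-byte element size are the assumed example values. The index must be at least 1, since there is no element 0 in this scheme.

```cpp
// Logical address of A[i] for a one-based array: L0 + (i - 1) * w.
// The index must be >= 1.
constexpr unsigned address_one_based(unsigned base, unsigned index, unsigned width) {
    return base + (index - 1) * width;
}

// Values from the example: base address 200, 2-byte elements (assumed).
static_assert(address_one_based(200, 1, 2) == 200, "A[1] is the first element, at 200");
static_assert(address_one_based(200, 3, 2) == 204, "A[3] is the third element, at 204");
static_assert(address_one_based(200, 5, 2) == 208, "A[5] is the last element, at 208");
```

Compared with the zero-based version, the only difference is the subtraction of 1 before the multiplication.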

Why does the index start from 0 in C++?

There is an extra arithmetic operation (i.e. -1) when the index starts at 1.

Although it's only one extra operation per access, over a large array it adds up, so access is slower than for an array where the index starts at 0.

Avoiding this extra operation is one of the reasons often given for zero-based indexing in C++, and it is one small contribution to the language's speed.